Structure in the Value Function of Two-Player Zero-Sum Games of Incomplete Information
Abstract
Decision-making in competitive games with incomplete information is a field with many promising applications for AI, both in games (e.g. poker) and in real-life settings (e.g. security). The most general game-theoretic framework that can be used to model such games is the zero-sum Partially Observable Stochastic Game (zs-POSG). While this model enables agents to make rational decisions, reasoning about it is challenging: in order to act rationally in a zs-POSG, agents must consider stochastic policies (of which there are infinitely many), and they must take into account uncertainty about the environment as well as uncertainty about their opponents. We aim to make reasoning about this class of models more tractable. We take inspiration from work in the collaborative multi-agent setting, where so-called plan-time sufficient statistics, representing probability distributions over joint sets of private information, have been shown to allow for a reduction from a decentralized model to a centralized one. This leads to increases in scalability, and allows the use of (adapted) solution methods for centralized models in the decentralized setting. We adapt these plan-time sufficient statistics for use in the competitive setting. Not only does this enable a reduction from a (decentralized) zs-POSG to a (centralized) Stochastic Game, but the value function of the zs-POSG, when defined in terms of this new statistic, also exhibits a particular concave/convex structure that is similar to the structure found in the collaborative setting. We propose an anytime algorithm that aims to exploit this structure in order to find bounds on the value function, and evaluate the performance of this method in two domains of our design. As it does not outperform existing solution methods, we analyze its shortcomings and give possible directions for future research.
Related resources
A TRANSITION FROM TWO-PERSON ZERO-SUM GAMES TO COOPERATIVE GAMES WITH FUZZY PAYOFFS
In this paper, we deal with games with fuzzy payoffs. We prove that players who are playing a zero-sum game with fuzzy payoffs against Nature are able to increase their joint payoff, and hence their individual payoffs, by cooperating. It is shown that a cooperative game with the fuzzy characteristic function can be constructed via the optimal game values of the zero-sum games with fuzzy payoff...
Markov Games with Frequent Actions and Incomplete Information - The Limit Case
We study a two-player, zero-sum, stochastic game with incomplete information on one side in which the players are allowed to play more and more frequently. The informed player observes the realization of a Markov chain on which the payoffs depend, while the non-informed player only observes his opponent’s actions. We show the existence of a limit value as the time span between two consecutive s...
Markov games with frequent actions and incomplete information
We study a two-player, zero-sum, stochastic game with incomplete information on one side in which the players are allowed to play more and more frequently. The informed player observes the realization of a Markov chain on which the payoffs depend, while the non-informed player only observes his opponent’s actions. We show the existence of a limit value as the time span between two consecutive s...
On Repeated Zero-Sum Games with Incomplete Information and Asymptotically Bounded Values
We consider repeated zero-sum games with incomplete information on the side of Player 2 with the total payoff given by the non-normalized sum of stage gains. In the classical examples, the value V_N of such an N-stage game is of the order of N or √N as N → ∞. Our aim is to present a general framework for another asymptotic behavior of the value V_N observed for the discrete version of the financial ...
Differential Games with Incomplete Information on a Continuum of Initial Positions and without Isaacs Condition
This article deals with a two-player zero-sum differential game with infinitely many initial positions and without Isaacs condition. The structure of information is asymmetric: the first player has private information on the initial position, while the second player knows only a probability distribution on the initial position. In the present model, we face two difficulties: First, the incompl...
Solving two-person zero-sum repeated games of incomplete information
In repeated games with incomplete information, rational agents must carefully weigh the tradeoffs of advantageously exploiting their information to achieve a short-term gain versus carefully concealing their information so as not to give up a long-term informed advantage. The theory of infinitely repeated two-player zero-sum games with incomplete information has been carefully studied, beginning...